86 research outputs found
Seeding with Costly Network Information
We study the task of selecting nodes in a social network of size , to
seed a diffusion with maximum expected spread size, under the independent
cascade model with cascade probability . Most of the previous work on this
problem (known as influence maximization) focuses on efficient algorithms to
approximate the optimal seed set with provable guarantees, given the knowledge
of the entire network. However, in practice, obtaining full knowledge of the
network is very costly. To address this gap, we first study the achievable
guarantees using influence samples. We provide an approximation
algorithm with a tight (1-1/e){\mbox{OPT}}-\epsilon n guarantee, using
influence samples and show that this dependence on
is asymptotically optimal. We then propose a probing algorithm that queries
edges from the graph and use them to find a seed set with the
same almost tight approximation guarantee. We also provide a matching (up to
logarithmic factors) lower-bound on the required number of edges. To address
the dependence of our probing algorithm on the independent cascade probability
, we show that it is impossible to maintain the same approximation
guarantees by controlling the discrepancy between the probing and seeding
cascade probabilities. Instead, we propose to down-sample the probed edges to
match the seeding cascade probability, provided that it does not exceed that of
probing. Finally, we test our algorithms on real world data to quantify the
trade-off between the cost of obtaining more refined network information and
the benefit of the added information for guiding improved seeding strategies
ALLOCATIONS IN LARGE MARKETS
Rapid growth and popularity of internet based services such as online markets and online advertisement systems provide a lot of new algorithmic challenges. One of the main challenges is the limited access to the input. There are two main reasons that algorithms have limited data accessibility.
1) The input is extremely large, and hence having access to the whole data at once is not practical.
2) The nature of the system forces us to make decisions before observing the whole input.
Internet-enabled marketplaces such as Amazon and eBay deal with huge datasets registering transaction of merchandises between lots of buyers and sellers. It is important that algorithms become more time and space efficient as the size of datasets increase. An algorithm that runs in polynomial time may not have a reasonable running time for such large datasets. In the first part of this dissertation, we study the development of allocation algorithms that are appropriate for use with massive datasets. We especially focus on the streaming setting which is a common model for big data analysis. In the graph streaming, the algorithm has access to a sequence of edges, called a stream. The algorithm reads edges in the order in which they appear in the stream. The goal is to design an algorithm that maintains a large allocation, using as little space as possible. We achieve our results by developing powerful sampling techniques. Indeed, one can implement our sampling techniques in the streaming setting as well as other distributed settings such as MapReduce.
Giant online advertisement markets such as Google, Bing and Facebook raised up several interesting allocation problems. Usually, in these applications, we need to make the decision before obtaining the full information of the input graph. This enforces an uncertainty on our belief about the input, and thus makes the classical algorithms inapplicable. To address this shortcoming online algorithms have been developed. In online algorithms again the input is a sequence of items. Here the algorithm needs to make an irrevocable decision upon arrival of each item. In the second part of this dissertation, we aim to achieve two main goals for each allocation problem in the market. Our first goal is to design models to capture the uncertainty of the input based on the properties of problems and the accessible data in real applications. Our second goal is to design algorithms and develop new techniques for these market allocation problems
- …